Overview
Dataset statistics
| Number of variables | 15 |
|---|---|
| Number of observations | 16942236 |
| Missing cells | 1037769 |
| Missing cells (%) | 0.4% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 4.5 GiB |
| Average record size in memory | 283.7 B |
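The missing-cell percentage can be sanity-checked from the other figures in this table. The following is a minimal sketch, assuming the percentage is taken over all cells (variables × observations); the variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 15
n_observations = 16_942_236
missing_cells = 1_037_769

# Assumption: the percentage is computed over every cell (rows x columns),
# not over rows or over a single column.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"{missing_pct:.1f}%")
```

Under that assumption the result agrees with the reported 0.4%. The same count divided by the number of observations alone gives the 6.1% reported for days_since_prior_order, the only variable with missing values.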
Variable types
| Numeric | 11 |
|---|---|
| Text | 2 |
| Categorical | 1 |
| Boolean | 1 |
Alerts
| department is highly overall correlated with department_id | High correlation |
|---|---|
| department_id is highly overall correlated with department | High correlation |
| tip_id is highly overall correlated with user_id | High correlation |
| user_id is highly overall correlated with tip_id | High correlation |
| days_since_prior_order has 1037769 (6.1%) missing values | Missing |
| order_dow has 3277120 (19.3%) zeros | Zeros |
| days_since_prior_order has 235000 (1.4%) zeros | Zeros |
Reproduction
| Analysis started | 2024-11-06 21:21:03.165629 |
|---|---|
| Analysis finished | 2024-11-06 21:26:21.625176 |
| Duration | 5 minutes and 18.46 seconds |
| Software version | ydata-profiling v4.12.0 |
| Download configuration | config.json |
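The reported duration follows directly from the two timestamps. A short sketch using only the standard library, with the timestamps copied from the table above:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" table.
started = datetime.fromisoformat("2024-11-06 21:21:03.165629")
finished = datetime.fromisoformat("2024-11-06 21:26:21.625176")

# Split the elapsed time into whole minutes and remaining seconds.
elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)

print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```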
Variables
order_id
Real number (ℝ)
| Distinct | 1673021 |
|---|---|
| Distinct (%) | 9.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1710930.9 |
| Minimum | 1 |
|---|---|
| Maximum | 3421081 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 129.3 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 170728.75 |
| Q1 | 854784 |
| median | 1711184 |
| Q3 | 2565863 |
| 95-th percentile | 3249789 |
| Maximum | 3421081 |
| Range | 3421080 |
| Interquartile range (IQR) | 1711079 |
Descriptive statistics
| Standard deviation | 987705.51 |
|---|---|
| Coefficient of variation (CV) | 0.57729129 |
| Kurtosis | -1.2003968 |
| Mean | 1710930.9 |
| Median Absolute Deviation (MAD) | 855542 |
| Skewness | -0.00083947472 |
| Sum | 2.8986995 × 10¹³ |
| Variance | 9.7556217 × 10¹¹ |
| Monotonicity | Not monotonic |
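Several of the figures above are derived from others in the same tables. The sketch below recomputes them for order_id from the reported values; it assumes the conventional definitions (range = max − min, IQR = Q3 − Q1, CV = standard deviation / mean, variance = standard deviation squared), and small differences are expected because the inputs are rounded.

```python
# Values copied from the order_id tables above.
minimum, maximum = 1, 3421081
q1, q3 = 854784, 2565863
mean, std = 1710930.9, 987705.51

value_range = maximum - minimum   # reported: 3421080
iqr = q3 - q1                     # reported: 1711079
cv = std / mean                   # reported: 0.57729129
variance = std ** 2               # reported: 9.7556217e11

print(value_range, iqr, round(cv, 6), f"{variance:.7e}")
```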
| Value | Count | Frequency (%) |
| 2970392 | 121 | < 0.1% |
| 171934 | 104 | < 0.1% |
| 1867980 | 102 | < 0.1% |
| 1384519 | 102 | < 0.1% |
| 653887 | 102 | < 0.1% |
| 3052353 | 101 | < 0.1% |
| 3048680 | 100 | < 0.1% |
| 2716231 | 99 | < 0.1% |
| 1031566 | 95 | < 0.1% |
| 1730767 | 95 | < 0.1% |
| Other values (1673011) | 16941215 |
| Value | Count | Frequency (%) |
| 1 | 8 | < 0.1% |
| 2 | 9 | < 0.1% |
| 4 | 13 | < 0.1% |
| 5 | 26 | < 0.1% |
| 8 | 1 | < 0.1% |
| 10 | 15 | < 0.1% |
| 13 | 13 | < 0.1% |
| 15 | 5 | < 0.1% |
| 18 | 28 | < 0.1% |
| 19 | 3 | < 0.1% |
| Value | Count | Frequency (%) |
| 3421081 | 7 | < 0.1% |
| 3421080 | 9 | < 0.1% |
| 3421079 | 1 | < 0.1% |
| 3421077 | 4 | < 0.1% |
| 3421073 | 2 | < 0.1% |
| 3421071 | 5 | < 0.1% |
| 3421067 | 1 | < 0.1% |
| 3421066 | 6 | < 0.1% |
| 3421064 | 3 | < 0.1% |
| 3421061 | 22 | < 0.1% |
user_id
Real number (ℝ)
High correlation 
| Distinct | 103104 |
|---|---|
| Distinct (%) | 0.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 102720.62 |
| Minimum | 1 |
|---|---|
| Maximum | 206209 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 129.3 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 10396 |
| Q1 | 51127 |
| median | 102283 |
| Q3 | 154005 |
| 95-th percentile | 195960 |
| Maximum | 206209 |
| Range | 206208 |
| Interquartile range (IQR) | 102878 |
Descriptive statistics
| Standard deviation | 59447.466 |
|---|---|
| Coefficient of variation (CV) | 0.57872965 |
| Kurtosis | -1.1953846 |
| Mean | 102720.62 |
| Median Absolute Deviation (MAD) | 51474 |
| Skewness | 0.011557939 |
| Sum | 1.7403169 × 10¹² |
| Variance | 3.5340012 × 10⁹ |
| Monotonicity | Increasing |
| Value | Count | Frequency (%) |
| 137629 | 2931 | < 0.1% |
| 182401 | 2929 | < 0.1% |
| 33731 | 2912 | < 0.1% |
| 108187 | 2760 | < 0.1% |
| 79106 | 2631 | < 0.1% |
| 5360 | 2602 | < 0.1% |
| 17738 | 2596 | < 0.1% |
| 13701 | 2579 | < 0.1% |
| 72136 | 2536 | < 0.1% |
| 181991 | 2535 | < 0.1% |
| Other values (103094) | 16915225 |
| Value | Count | Frequency (%) |
| 1 | 70 | < 0.1% |
| 3 | 88 | < 0.1% |
| 5 | 46 | < 0.1% |
| 6 | 14 | < 0.1% |
| 7 | 215 | < 0.1% |
| 8 | 67 | < 0.1% |
| 10 | 147 | < 0.1% |
| 11 | 94 | < 0.1% |
| 14 | 221 | < 0.1% |
| 18 | 50 | < 0.1% |
| Value | Count | Frequency (%) |
| 206209 | 137 | < 0.1% |
| 206208 | 677 | < 0.1% |
| 206207 | 223 | < 0.1% |
| 206206 | 285 | < 0.1% |
| 206201 | 404 | < 0.1% |
| 206199 | 349 | < 0.1% |
| 206198 | 54 | < 0.1% |
| 206197 | 181 | < 0.1% |
| 206196 | 105 | < 0.1% |
| 206195 | 73 | < 0.1% |
order_number
Real number (ℝ)
| Distinct | 100 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 17.170166 |
| Minimum | 1 |
|---|---|
| Maximum | 100 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 129.3 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 5 |
| median | 11 |
| Q3 | 24 |
| 95-th percentile | 54 |
| Maximum | 100 |
| Range | 99 |
| Interquartile range (IQR) | 19 |
Descriptive statistics
| Standard deviation | 17.51389 |
|---|---|
| Coefficient of variation (CV) | 1.0200187 |
| Kurtosis | 3.3432572 |
| Mean | 17.170166 |
| Median Absolute Deviation (MAD) | 8 |
| Skewness | 1.7731862 |
| Sum | 2.90901 × 10⁸ |
| Variance | 306.73634 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 1037769 | 6.1% |
| 2 | 1026053 | 6.1% |
| 3 | 1025220 | 6.1% |
| 4 | 982271 | 5.8% |
| 5 | 874193 | 5.2% |
| 6 | 787618 | 4.6% |
| 7 | 713043 | 4.2% |
| 8 | 647741 | 3.8% |
| 9 | 595752 | 3.5% |
| 10 | 545087 | 3.2% |
| Other values (90) | 8707489 |
| Value | Count | Frequency (%) |
| 1 | 1037769 | 6.1% |
| 2 | 1026053 | 6.1% |
| 3 | 1025220 | 6.1% |
| 4 | 982271 | 5.8% |
| 5 | 874193 | 5.2% |
| 6 | 787618 | 4.6% |
| 7 | 713043 | 4.2% |
| 8 | 647741 | 3.8% |
| 9 | 595752 | 3.5% |
| 10 | 545087 | 3.2% |
| Value | Count | Frequency (%) |
| 100 | 3762 | < 0.1% |
| 99 | 6392 | < 0.1% |
| 98 | 6599 | < 0.1% |
| 97 | 6947 | < 0.1% |
| 96 | 7139 | < 0.1% |
| 95 | 7380 | < 0.1% |
| 94 | 7831 | < 0.1% |
| 93 | 7936 | < 0.1% |
| 92 | 8386 | < 0.1% |
| 91 | 8832 | 0.1% |
order_dow
Real number (ℝ)
Zeros 
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.7410518 |
| Minimum | 0 |
|---|---|
| Maximum | 6 |
| Zeros | 3277120 |
| Zeros (%) | 19.3% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 129.3 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 3 |
| Q3 | 5 |
| 95-th percentile | 6 |
| Maximum | 6 |
| Range | 6 |
| Interquartile range (IQR) | 4 |
Descriptive statistics
| Standard deviation | 2.0953411 |
|---|---|
| Coefficient of variation (CV) | 0.7644296 |
| Kurtosis | -1.3396147 |
| Mean | 2.7410518 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 0.1768335 |
| Sum | 46439546 |
| Variance | 4.3904544 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 3277120 | 19.3% |
| 1 | 2929808 | 17.3% |
| 6 | 2369792 | 14.0% |
| 5 | 2204014 | 13.0% |
| 2 | 2188126 | 12.9% |
| 3 | 1998840 | 11.8% |
| 4 | 1974536 | 11.7% |
| Value | Count | Frequency (%) |
| 0 | 3277120 | 19.3% |
| 1 | 2929808 | 17.3% |
| 2 | 2188126 | 12.9% |
| 3 | 1998840 | 11.8% |
| 4 | 1974536 | 11.7% |
| 5 | 2204014 | 13.0% |
| 6 | 2369792 | 14.0% |
| Value | Count | Frequency (%) |
| 6 | 2369792 | 14.0% |
| 5 | 2204014 | 13.0% |
| 4 | 1974536 | 11.7% |
| 3 | 1998840 | 11.8% |
| 2 | 2188126 | 12.9% |
| 1 | 2929808 | 17.3% |
| 0 | 3277120 | 19.3% |
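The Frequency (%) column in these value tables is each count divided by the number of observations. A minimal sketch for order_dow, using the counts from the tables above and assuming one-decimal rounding (the names are illustrative):

```python
# Counts per order_dow value, copied from the tables above.
counts = {
    0: 3277120,
    1: 2929808,
    2: 2188126,
    3: 1998840,
    4: 1974536,
    5: 2204014,
    6: 2369792,
}

# The seven values cover every row, so the counts sum to the
# number of observations in the dataset.
total = sum(counts.values())

# Share of all observations per value, formatted to one decimal place.
freq = {value: f"{100 * count / total:.1f}%" for value, count in counts.items()}

for value, pct in freq.items():
    print(value, counts[value], pct)
```

The share for value 0 matches the 19.3% quoted in the Zeros alert for this variable.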
order_hour_of_day
Real number (ℝ)
| Distinct | 24 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 13.424394 |
| Minimum | 0 |
|---|---|
| Maximum | 23 |
| Zeros | 116277 |
| Zeros (%) | 0.7% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 129.3 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 7 |
| Q1 | 10 |
| median | 13 |
| Q3 | 16 |
| 95-th percentile | 21 |
| Maximum | 23 |
| Range | 23 |
| Interquartile range (IQR) | 6 |
Descriptive statistics
| Standard deviation | 4.2530374 |
|---|---|
| Coefficient of variation (CV) | 0.31681411 |
| Kurtosis | 0.00043555841 |
| Mean | 13.424394 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | -0.049931377 |
| Sum | 2.2743925 × 10⁸ |
| Variance | 18.088327 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 10 | 1437440 | 8.5% |
| 11 | 1429688 | 8.4% |
| 14 | 1414872 | 8.4% |
| 13 | 1396139 | 8.2% |
| 15 | 1387447 | 8.2% |
| 12 | 1371139 | 8.1% |
| 16 | 1328987 | 7.8% |
| 9 | 1271382 | 7.5% |
| 17 | 1088769 | 6.4% |
| 8 | 899191 | 5.3% |
| Other values (14) | 3917182 |
| Value | Count | Frequency (%) |
| 0 | 116277 | 0.7% |
| 1 | 63261 | 0.4% |
| 2 | 36604 | 0.2% |
| 3 | 27566 | 0.2% |
| 4 | 28132 | 0.2% |
| 5 | 46474 | 0.3% |
| 6 | 151987 | 0.9% |
| 7 | 467268 | 2.8% |
| 8 | 899191 | 5.3% |
| 9 | 1271382 | 7.5% |
| Value | Count | Frequency (%) |
| 23 | 210431 | 1.2% |
| 22 | 334958 | 2.0% |
| 21 | 419307 | 2.5% |
| 20 | 511689 | 3.0% |
| 19 | 652876 | 3.9% |
| 18 | 850352 | 5.0% |
| 17 | 1088769 | 6.4% |
| 16 | 1328987 | 7.8% |
| 15 | 1387447 | 8.2% |
| 14 | 1414872 | 8.4% |
days_since_prior_order
Real number (ℝ)
Missing  Zeros 
| Distinct | 31 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 1037769 |
| Missing (%) | 6.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 11.343335 |
| Minimum | 0 |
|---|---|
| Maximum | 30 |
| Zeros | 235000 |
| Zeros (%) | 1.4% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 129.3 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 5 |
| median | 8 |
| Q3 | 15 |
| 95-th percentile | 30 |
| Maximum | 30 |
| Range | 30 |
| Interquartile range (IQR) | 10 |
Descriptive statistics
| Standard deviation | 8.9265116 |
|---|---|
| Coefficient of variation (CV) | 0.78693893 |
| Kurtosis | -0.20772786 |
| Mean | 11.343335 |
| Median Absolute Deviation (MAD) | 4 |
| Skewness | 1.0076553 |
| Sum | 1.8040969 × 10⁸ |
| Variance | 79.68261 |
| Monotonicity | Not monotonic |
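Because this variable has missing values, its mean is not the sum divided by the total number of observations. The sketch below assumes the statistics are computed over non-missing values only, and checks that assumption against the reported sum and mean; the names are illustrative.

```python
# Figures copied from the overview and the days_since_prior_order tables.
n_observations = 16_942_236
n_missing = 1_037_769
reported_sum = 1.8040969e8   # rounded in the report
reported_mean = 11.343335

# Assumption: missing values are excluded before computing the mean.
n_present = n_observations - n_missing
mean_over_present = reported_sum / n_present
mean_over_all = reported_sum / n_observations

print(n_present, round(mean_over_present, 4), round(mean_over_all, 4))
```

Only the mean over non-missing rows reproduces the reported 11.343335 (up to rounding of the sum), which supports the assumption.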
| Value | Count | Frequency (%) |
| 7 | 1807177 | 10.7% |
| 30 | 1732380 | 10.2% |
| 6 | 1294189 | 7.6% |
| 5 | 1095107 | 6.5% |
| 4 | 1069148 | 6.3% |
| 8 | 1000535 | 5.9% |
| 3 | 957148 | 5.6% |
| 2 | 748607 | 4.4% |
| 9 | 634130 | 3.7% |
| 14 | 540232 | 3.2% |
| Other values (21) | 5025814 | 29.7% |
| (Missing) | 1037769 | 6.1% |
| Value | Count | Frequency (%) |
| 0 | 235000 | 1.4% |
| 1 | 482217 | 2.8% |
| 2 | 748607 | 4.4% |
| 3 | 957148 | 5.6% |
| 4 | 1069148 | 6.3% |
| 5 | 1095107 | 6.5% |
| 6 | 1294189 | 7.6% |
| 7 | 1807177 | 10.7% |
| 8 | 1000535 | 5.9% |
| 9 | 634130 | 3.7% |
| Value | Count | Frequency (%) |
| 30 | 1732380 | 10.2% |
| 29 | 94961 | 0.6% |
| 28 | 136014 | 0.8% |
| 27 | 108168 | 0.6% |
| 26 | 95503 | 0.6% |
| 25 | 96798 | 0.6% |
| 24 | 104105 | 0.6% |
| 23 | 121144 | 0.7% |
| 22 | 165559 | 1.0% |
| 21 | 236470 | 1.4% |
product_id
Real number (ℝ)
| Distinct | 49258 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 25577.236 |
| Minimum | 1 |
|---|---|
| Maximum | 49688 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 129.3 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 3376 |
| Q1 | 13521 |
| median | 25256 |
| Q3 | 37947 |
| 95-th percentile | 47570 |
| Maximum | 49688 |
| Range | 49687 |
| Interquartile range (IQR) | 24426 |
Descriptive statistics
| Standard deviation | 14100.859 |
|---|---|
| Coefficient of variation (CV) | 0.55130504 |
| Kurtosis | -1.1420139 |
| Mean | 25577.236 |
| Median Absolute Deviation (MAD) | 12080 |
| Skewness | -0.02087385 |
| Sum | 4.3333556 × 10¹¹ |
| Variance | 1.9883422 × 10⁸ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 24852 | 244850 | 1.4% |
| 13176 | 197298 | 1.2% |
| 21137 | 138445 | 0.8% |
| 21903 | 126674 | 0.7% |
| 47209 | 111509 | 0.7% |
| 47766 | 91565 | 0.5% |
| 47626 | 80748 | 0.5% |
| 16797 | 74606 | 0.4% |
| 26209 | 74107 | 0.4% |
| 27966 | 71330 | 0.4% |
| Other values (49248) | 15731104 |
| Value | Count | Frequency (%) |
| 1 | 1021 | < 0.1% |
| 2 | 33 | < 0.1% |
| 3 | 130 | < 0.1% |
| 4 | 164 | < 0.1% |
| 5 | 4 | < 0.1% |
| 6 | 3 | < 0.1% |
| 7 | 23 | < 0.1% |
| 8 | 66 | < 0.1% |
| 9 | 79 | < 0.1% |
| 10 | 1282 | < 0.1% |
| Value | Count | Frequency (%) |
| 49688 | 47 | < 0.1% |
| 49687 | 6 | < 0.1% |
| 49686 | 85 | < 0.1% |
| 49685 | 28 | < 0.1% |
| 49684 | 4 | < 0.1% |
| 49683 | 50205 | 0.3% |
| 49682 | 60 | < 0.1% |
| 49681 | 28 | < 0.1% |
| 49680 | 501 | < 0.1% |
| 49679 | 60 | < 0.1% |
add_to_cart_order
Real number (ℝ)
| Distinct | 121 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 8.3960477 |
| Minimum | 1 |
|---|---|
| Maximum | 121 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 129.3 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 3 |
| median | 6 |
| Q3 | 12 |
| 95-th percentile | 22 |
| Maximum | 121 |
| Range | 120 |
| Interquartile range (IQR) | 9 |
Descriptive statistics
| Standard deviation | 7.1715785 |
|---|---|
| Coefficient of variation (CV) | 0.85416123 |
| Kurtosis | 5.1324417 |
| Mean | 8.3960477 |
| Median Absolute Deviation (MAD) | 4 |
| Skewness | 1.7895549 |
| Sum | 1.4224782 × 10⁸ |
| Variance | 51.431538 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 1673021 | 9.9% |
| 2 | 1591820 | 9.4% |
| 3 | 1494855 | 8.8% |
| 4 | 1387036 | 8.2% |
| 5 | 1271349 | 7.5% |
| 6 | 1152561 | 6.8% |
| 7 | 1034543 | 6.1% |
| 8 | 920714 | 5.4% |
| 9 | 815300 | 4.8% |
| 10 | 719304 | 4.2% |
| Other values (111) | 4881733 |
| Value | Count | Frequency (%) |
| 1 | 1673021 | 9.9% |
| 2 | 1591820 | 9.4% |
| 3 | 1494855 | 8.8% |
| 4 | 1387036 | 8.2% |
| 5 | 1271349 | 7.5% |
| 6 | 1152561 | 6.8% |
| 7 | 1034543 | 6.1% |
| 8 | 920714 | 5.4% |
| 9 | 815300 | 4.8% |
| 10 | 719304 | 4.2% |
| Value | Count | Frequency (%) |
| 121 | 1 | < 0.1% |
| 120 | 1 | < 0.1% |
| 119 | 1 | < 0.1% |
| 118 | 1 | < 0.1% |
| 117 | 1 | < 0.1% |
| 116 | 1 | < 0.1% |
| 115 | 1 | < 0.1% |
| 114 | 1 | < 0.1% |
| 113 | 1 | < 0.1% |
| 112 | 1 | < 0.1% |
product_name
Text
| Distinct | 49258 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.2 GiB |
Length
| Max length | 159 |
|---|---|
| Median length | 124 |
| Mean length | 25.020709 |
| Min length | 3 |
Unique
| Unique | 1230 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | Soda |
|---|---|
| 2nd row | Organic Unsweetened Vanilla Almond Milk |
| 3rd row | Original Beef Jerky |
| 4th row | Aged White Cheddar Popcorn |
| 5th row | XL Pick-A-Size Paper Towel Rolls |
| Value | Count | Frequency (%) |
| organic | 5328597 | 8.3% |
| | 1050303 | 1.6% |
| milk | 905101 | 1.4% |
| cheese | 764222 | 1.2% |
| yogurt | 702792 | 1.1% |
| whole | 636286 | 1.0% |
| free | 604825 | 0.9% |
| original | 520726 | 0.8% |
| water | 517317 | 0.8% |
| baby | 516782 | 0.8% |
| Other values (11959) | 52990510 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 47687292 | 11.2% |
| e | 39810195 | 9.4% |
| a | 36057670 | 8.5% |
| r | 29922629 | 7.1% |
| i | 25007978 | 5.9% |
| n | 23154655 | 5.5% |
| o | 18643600 | 4.4% |
| l | 17536874 | 4.1% |
| t | 17497147 | 4.1% |
| s | 14170037 | 3.3% |
| Other values (102) | 154418675 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 423906752 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 423906752 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 423906752 |
aisle_id
Real number (ℝ)
| Distinct | 134 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 71.174274 |
| Minimum | 1 |
|---|---|
| Maximum | 134 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 129.3 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 16 |
| Q1 | 31 |
| median | 83 |
| Q3 | 107 |
| 95-th percentile | 123 |
| Maximum | 134 |
| Range | 133 |
| Interquartile range (IQR) | 76 |
Descriptive statistics
| Standard deviation | 38.197941 |
|---|---|
| Coefficient of variation (CV) | 0.53668185 |
| Kurtosis | -1.3244584 |
| Mean | 71.174274 |
| Median Absolute Deviation (MAD) | 33 |
| Skewness | -0.16629414 |
| Sum | 1.2058514 × 10⁹ |
| Variance | 1459.0827 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 24 | 1895765 | 11.2% |
| 83 | 1792105 | 10.6% |
| 123 | 922576 | 5.4% |
| 120 | 753242 | 4.4% |
| 21 | 510381 | 3.0% |
| 84 | 459496 | 2.7% |
| 115 | 439984 | 2.6% |
| 107 | 376282 | 2.2% |
| 91 | 331730 | 2.0% |
| 112 | 304667 | 1.8% |
| Other values (124) | 9156008 |
| Value | Count | Frequency (%) |
| 1 | 37794 | 0.2% |
| 2 | 43208 | 0.3% |
| 3 | 239790 | 1.4% |
| 4 | 104544 | 0.6% |
| 5 | 32540 | 0.2% |
| 6 | 19342 | 0.1% |
| 7 | 17508 | 0.1% |
| 8 | 18956 | 0.1% |
| 9 | 114889 | 0.7% |
| 10 | 4860 | < 0.1% |
| Value | Count | Frequency (%) |
| 134 | 6032 | < 0.1% |
| 133 | 9715 | 0.1% |
| 132 | 3125 | < 0.1% |
| 131 | 139180 | 0.8% |
| 130 | 82028 | 0.5% |
| 129 | 101521 | 0.6% |
| 128 | 100788 | 0.6% |
| 127 | 21506 | 0.1% |
| 126 | 10370 | 0.1% |
| 125 | 18537 | 0.1% |
department_id
Real number (ℝ)
High correlation 
| Distinct | 21 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 9.9161701 |
| Minimum | 1 |
|---|---|
| Maximum | 21 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 129.3 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 4 |
| median | 9 |
| Q3 | 16 |
| 95-th percentile | 19 |
| Maximum | 21 |
| Range | 20 |
| Interquartile range (IQR) | 12 |
Descriptive statistics
| Standard deviation | 6.2813163 |
|---|---|
| Coefficient of variation (CV) | 0.63344176 |
| Kurtosis | -1.5597187 |
| Mean | 9.9161701 |
| Median Absolute Deviation (MAD) | 5 |
| Skewness | 0.15272867 |
| Sum | 1.6800209 × 10⁸ |
| Variance | 39.454934 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 4 | 4957175 | 29.3% |
| 16 | 2816030 | 16.6% |
| 19 | 1509667 | 8.9% |
| 7 | 1402323 | 8.3% |
| 1 | 1170018 | 6.9% |
| 13 | 984550 | 5.8% |
| 3 | 613344 | 3.6% |
| 15 | 560188 | 3.3% |
| 20 | 546160 | 3.2% |
| 9 | 452992 | 2.7% |
| Other values (11) | 1929789 | 11.4% |
| Value | Count | Frequency (%) |
| 1 | 1170018 | 6.9% |
| 2 | 19342 | 0.1% |
| 3 | 613344 | 3.6% |
| 4 | 4957175 | 29.3% |
| 5 | 82145 | 0.5% |
| 6 | 141130 | 0.8% |
| 7 | 1402323 | 8.3% |
| 8 | 51742 | 0.3% |
| 9 | 452992 | 2.7% |
| 10 | 18036 | 0.1% |
| Value | Count | Frequency (%) |
| 21 | 39026 | 0.2% |
| 20 | 546160 | 3.2% |
| 19 | 1509667 | 8.9% |
| 18 | 218996 | 1.3% |
| 17 | 386709 | 2.3% |
| 16 | 2816030 | 16.6% |
| 15 | 560188 | 3.3% |
| 14 | 370020 | 2.2% |
| 13 | 984550 | 5.8% |
| 12 | 369467 | 2.2% |
aisle
Text
| Distinct | 134 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.0 GiB |
Length
| Max length | 29 |
|---|---|
| Median length | 23 |
| Mean length | 14.45503 |
| Min length | 3 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | soft drinks |
|---|---|
| 2nd row | soy lactosefree |
| 3rd row | popcorn jerky |
| 4th row | popcorn jerky |
| 5th row | paper goods |
| Value | Count | Frequency (%) |
| fresh | 4089173 | 11.4% |
| vegetables | 2879334 | 8.0% |
| fruits | 2827296 | 7.9% |
| packaged | 1672328 | 4.6% |
| frozen | 908224 | 2.5% |
| water | 879968 | 2.4% |
| yogurt | 753242 | 2.1% |
| ice | 529363 | 1.5% |
| cheese | 510381 | 1.4% |
| milk | 459496 | 1.3% |
| Other values (194) | 20505966 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 35002107 | 14.3% |
| s | 23210298 | 9.5% |
| r | 20757851 | 8.5% |
| a | 19212352 | 7.8% |
| (space) | 19072535 | 7.8% |
| t | 14378351 | 5.9% |
| f | 10534058 | 4.3% |
| i | 9796446 | 4.0% |
| o | 9247781 | 3.8% |
| g | 9154511 | 3.7% |
| Other values (16) | 74534237 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 244900527 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 244900527 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 244900527 |
department
Categorical
High correlation 
| Distinct | 21 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 920.9 MiB |
| produce | |
|---|---|
| dairy eggs | |
| snacks | |
| beverages | |
| frozen | |
| Other values (16) |
Length
| Max length | 15 |
|---|---|
| Median length | 13 |
| Mean length | 7.997576 |
| Min length | 4 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | beverages |
|---|---|
| 2nd row | dairy eggs |
| 3rd row | snacks |
| 4th row | snacks |
| 5th row | household |
Common Values
| Value | Count | Frequency (%) |
| produce | 4957175 | 29.3% |
| dairy eggs | 2816030 | 16.6% |
| snacks | 1509667 | 8.9% |
| beverages | 1402323 | 8.3% |
| frozen | 1170018 | 6.9% |
| pantry | 984550 | 5.8% |
| bakery | 613344 | 3.6% |
| canned goods | 560188 | 3.3% |
| deli | 546160 | 3.2% |
| dry goods pasta | 452992 | 2.7% |
| Other values (11) | 1929789 | 11.4% |
Words
| Value | Count | Frequency (%) |
| produce | 4957175 | 22.7% |
| dairy | 2816030 | 12.9% |
| eggs | 2816030 | 12.9% |
| snacks | 1509667 | 6.9% |
| beverages | 1402323 | 6.4% |
| frozen | 1170018 | 5.4% |
| goods | 1013180 | 4.6% |
| pantry | 984550 | 4.5% |
| bakery | 613344 | 2.8% |
| canned | 560188 | 2.6% |
| Other values (16) | 3984576 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 17263109 | 12.7% |
| r | 13393276 | 9.9% |
| a | 11320813 | 8.4% |
| d | 11101901 | 8.2% |
| s | 10412021 | 7.7% |
| o | 10223843 | 7.5% |
| g | 8086589 | 6.0% |
| c | 7342351 | 5.4% |
| p | 6679635 | 4.9% |
| n | 5480203 | 4.0% |
| Other values (13) | 34193079 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 135496820 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 135496820 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 135496820 |
tip_id
Real number (ℝ)
High correlation 
| Distinct | 1673021 |
|---|---|
| Distinct (%) | 9.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1706272.9 |
| Minimum | 0 |
|---|---|
| Maximum | 3421082 |
| Zeros | 5 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 129.3 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 172187 |
| Q1 | 850523 |
| median | 1703692 |
| Q3 | 2559312 |
| 95-th percentile | 3251135 |
| Maximum | 3421082 |
| Range | 3421082 |
| Interquartile range (IQR) | 1708789 |
Descriptive statistics
| Standard deviation | 986164.09 |
|---|---|
| Coefficient of variation (CV) | 0.57796388 |
| Kurtosis | -1.1956622 |
| Mean | 1706272.9 |
| Median Absolute Deviation (MAD) | 854279 |
| Skewness | 0.0052109728 |
| Sum | 2.8908078 × 10¹³ |
| Variance | 9.7251962 × 10¹¹ |
| Monotonicity | Increasing |
| Value | Count | Frequency (%) |
| 1327221 | 121 | < 0.1% |
| 1204060 | 104 | < 0.1% |
| 519395 | 102 | < 0.1% |
| 2115275 | 102 | < 0.1% |
| 1397638 | 102 | < 0.1% |
| 568301 | 101 | < 0.1% |
| 1801232 | 100 | < 0.1% |
| 2717969 | 99 | < 0.1% |
| 471303 | 95 | < 0.1% |
| 59030 | 95 | < 0.1% |
| Other values (1673011) | 16941215 |
| Value | Count | Frequency (%) |
| 0 | 5 | < 0.1% |
| 1 | 6 | < 0.1% |
| 2 | 5 | < 0.1% |
| 3 | 5 | < 0.1% |
| 4 | 8 | < 0.1% |
| 5 | 4 | < 0.1% |
| 6 | 5 | < 0.1% |
| 7 | 6 | < 0.1% |
| 8 | 6 | < 0.1% |
| 9 | 9 | < 0.1% |
| Value | Count | Frequency (%) |
| 3421082 | 8 | < 0.1% |
| 3421081 | 9 | < 0.1% |
| 3421080 | 20 | < 0.1% |
| 3421079 | 8 | < 0.1% |
| 3421078 | 9 | < 0.1% |
| 3421077 | 3 | < 0.1% |
| 3421076 | 12 | < 0.1% |
| 3421075 | 10 | < 0.1% |
| 3421074 | 2 | < 0.1% |
| 3421073 | 3 | < 0.1% |
tip
Boolean
| Value | Count | Frequency (%) |
| False | 9523085 | 56.2% |
| True | 7419151 | 43.8% |
Interactions
Correlations
| | add_to_cart_order | aisle_id | days_since_prior_order | department | department_id | order_dow | order_hour_of_day | order_id | order_number | product_id | tip | tip_id | user_id |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| add_to_cart_order | 1.000 | 0.006 | 0.075 | 0.034 | 0.016 | -0.015 | -0.013 | -0.001 | 0.003 | 0.008 | 0.020 | 0.003 | 0.003 |
| aisle_id | 0.006 | 1.000 | 0.006 | 0.438 | 0.021 | -0.002 | -0.002 | 0.000 | 0.001 | 0.005 | 0.060 | 0.001 | 0.001 |
| days_since_prior_order | 0.075 | 0.006 | 1.000 | 0.018 | -0.001 | -0.044 | -0.002 | -0.000 | -0.383 | 0.001 | 0.179 | 0.006 | 0.006 |
| department | 0.034 | 0.438 | 0.018 | 1.000 | 1.000 | 0.024 | 0.015 | 0.001 | 0.017 | 0.075 | 0.108 | 0.004 | 0.004 |
| department_id | 0.016 | 0.021 | -0.001 | 1.000 | 1.000 | 0.006 | -0.011 | -0.000 | 0.005 | -0.022 | 0.088 | 0.001 | 0.001 |
| order_dow | -0.015 | -0.002 | -0.044 | 0.024 | 0.006 | 1.000 | 0.012 | 0.000 | 0.015 | -0.003 | 0.143 | -0.003 | -0.003 |
| order_hour_of_day | -0.013 | -0.002 | -0.002 | 0.015 | -0.011 | 0.012 | 1.000 | 0.000 | -0.048 | 0.001 | 0.097 | -0.000 | -0.000 |
| order_id | -0.001 | 0.000 | -0.000 | 0.001 | -0.000 | 0.000 | 0.000 | 1.000 | -0.000 | -0.000 | 0.002 | -0.001 | -0.001 |
| order_number | 0.003 | 0.001 | -0.383 | 0.017 | 0.005 | 0.015 | -0.048 | -0.000 | 1.000 | -0.001 | 0.142 | -0.004 | -0.004 |
| product_id | 0.008 | 0.005 | 0.001 | 0.075 | -0.022 | -0.003 | 0.001 | -0.000 | -0.001 | 1.000 | 0.017 | -0.000 | -0.000 |
| tip | 0.020 | 0.060 | 0.179 | 0.108 | 0.088 | 0.143 | 0.097 | 0.002 | 0.142 | 0.017 | 1.000 | 0.009 | 0.010 |
| tip_id | 0.003 | 0.001 | 0.006 | 0.004 | 0.001 | -0.003 | -0.000 | -0.001 | -0.004 | -0.000 | 0.009 | 1.000 | 1.000 |
| user_id | 0.003 | 0.001 | 0.006 | 0.004 | 0.001 | -0.003 | -0.000 | -0.001 | -0.004 | -0.000 | 0.010 | 1.000 | 1.000 |
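The High correlation alerts correspond to the off-diagonal entries of this matrix with a large absolute value. The sketch below applies a cut-off to a handful of pairs copied from the table; the threshold of 0.5 is an assumption made for illustration, not a value stated in this report, and the actual setting is in the downloadable configuration.

```python
# Assumed cut-off for illustration only; see config.json for the real setting.
THRESHOLD = 0.5

# A few off-diagonal coefficients copied from the correlation table.
pairs = {
    ("department", "department_id"): 1.000,
    ("tip_id", "user_id"): 1.000,
    ("aisle_id", "department"): 0.438,
    ("days_since_prior_order", "order_number"): -0.383,
    ("days_since_prior_order", "tip"): 0.179,
}

# Keep pairs whose absolute coefficient reaches the cut-off.
flagged = sorted(pair for pair, r in pairs.items() if abs(r) >= THRESHOLD)

for left, right in flagged:
    print(f"{left} is highly correlated with {right}")
```

With this cut-off, only department/department_id and tip_id/user_id are flagged, which matches the alerts listed in the overview.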
Missing values
Sample
| | order_id | user_id | order_number | order_dow | order_hour_of_day | days_since_prior_order | product_id | add_to_cart_order | product_name | aisle_id | department_id | aisle | department | tip_id | tip |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2539329 | 1 | 1 | 2 | 8 | NaN | 196 | 1 | Soda | 77 | 7 | soft drinks | beverages | 0 | False |
| 1 | 2539329 | 1 | 1 | 2 | 8 | NaN | 14084 | 2 | Organic Unsweetened Vanilla Almond Milk | 91 | 16 | soy lactosefree | dairy eggs | 0 | False |
| 2 | 2539329 | 1 | 1 | 2 | 8 | NaN | 12427 | 3 | Original Beef Jerky | 23 | 19 | popcorn jerky | snacks | 0 | False |
| 3 | 2539329 | 1 | 1 | 2 | 8 | NaN | 26088 | 4 | Aged White Cheddar Popcorn | 23 | 19 | popcorn jerky | snacks | 0 | False |
| 4 | 2539329 | 1 | 1 | 2 | 8 | NaN | 26405 | 5 | XL Pick-A-Size Paper Towel Rolls | 54 | 17 | paper goods | household | 0 | False |
| 5 | 2398795 | 1 | 2 | 3 | 7 | 15.0 | 196 | 1 | Soda | 77 | 7 | soft drinks | beverages | 1 | False |
| 6 | 2398795 | 1 | 2 | 3 | 7 | 15.0 | 10258 | 2 | Pistachios | 117 | 19 | nuts seeds dried fruit | snacks | 1 | False |
| 7 | 2398795 | 1 | 2 | 3 | 7 | 15.0 | 12427 | 3 | Original Beef Jerky | 23 | 19 | popcorn jerky | snacks | 1 | False |
| 8 | 2398795 | 1 | 2 | 3 | 7 | 15.0 | 13176 | 4 | Bag of Organic Bananas | 24 | 4 | fresh fruits | produce | 1 | False |
| 9 | 2398795 | 1 | 2 | 3 | 7 | 15.0 | 26088 | 5 | Aged White Cheddar Popcorn | 23 | 19 | popcorn jerky | snacks | 1 | False |
| | order_id | user_id | order_number | order_dow | order_hour_of_day | days_since_prior_order | product_id | add_to_cart_order | product_name | aisle_id | department_id | aisle | department | tip_id | tip |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 16942226 | 2977660 | 206209 | 13 | 1 | 12 | 7.0 | 6567 | 8 | Chocolate Peanut Butter Protein Bar | 3 | 19 | energy granola bars | snacks | 3421081 | False |
| 16942227 | 2977660 | 206209 | 13 | 1 | 12 | 7.0 | 22920 | 9 | Roasted & Salted Shelled Pistachios | 117 | 19 | nuts seeds dried fruit | snacks | 3421081 | False |
| 16942228 | 272231 | 206209 | 14 | 6 | 14 | 30.0 | 6846 | 1 | Diet Pepsi Pack | 77 | 7 | soft drinks | beverages | 3421082 | False |
| 16942229 | 272231 | 206209 | 14 | 6 | 14 | 30.0 | 9405 | 2 | Calcium Enriched 100% Lactose Free Fat Free Milk | 91 | 16 | soy lactosefree | dairy eggs | 3421082 | False |
| 16942230 | 272231 | 206209 | 14 | 6 | 14 | 30.0 | 24852 | 3 | Banana | 24 | 4 | fresh fruits | produce | 3421082 | False |
| 16942231 | 272231 | 206209 | 14 | 6 | 14 | 30.0 | 40603 | 4 | Fabric Softener Sheets | 75 | 17 | laundry | household | 3421082 | False |
| 16942232 | 272231 | 206209 | 14 | 6 | 14 | 30.0 | 15655 | 5 | Dark Chocolate Mint Snacking Chocolate | 45 | 19 | candy chocolate | snacks | 3421082 | False |
| 16942233 | 272231 | 206209 | 14 | 6 | 14 | 30.0 | 42606 | 6 | Phish Food Frozen Yogurt | 37 | 1 | ice cream ice | frozen | 3421082 | False |
| 16942234 | 272231 | 206209 | 14 | 6 | 14 | 30.0 | 37966 | 7 | French Baguette Bread | 112 | 3 | bread | bakery | 3421082 | False |
| 16942235 | 272231 | 206209 | 14 | 6 | 14 | 30.0 | 39216 | 8 | Original Multigrain Spoonfuls Cereal | 121 | 14 | cereal | breakfast | 3421082 | False |